Overview

Dataset statistics

Number of variables: 9
Number of observations: 11495243
Missing cells: 862
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.1 GiB
Average record size in memory: 569.4 B

Variable types

Text: 7
Categorical: 1
Unsupported: 1

Alerts

titleType is highly imbalanced (61.9%) Imbalance
tconst has unique values Unique
isAdult is an unsupported type, check if it needs cleaning or further analysis Unsupported
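
The 61.9% in the imbalance alert can be approximately reproduced as one minus the normalized Shannon entropy of the titleType counts. The sketch below is offered under two assumptions: the formula itself, which the report does not state, and the count of the eleventh category, which is inferred as the number of observations minus the ten counts listed under Common Values. The function name imbalance_score is illustrative.

```python
import math

# titleType counts as listed under "Common Values" in this report.
counts = [8840212, 1047478, 708001, 306953, 277952,
          150118, 60182, 51593, 42180, 10573]
total_rows = 11495243
# Assumption: the eleventh category holds whatever the ten listed rows leave over.
counts.append(total_rows - sum(counts))


def imbalance_score(class_counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares."""
    n = sum(class_counts)
    shares = [c / n for c in class_counts if c > 0]
    if len(shares) < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(len(shares))


score = imbalance_score(counts)
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by one class approaches 1.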

Reproduction

Analysis started: 2025-03-04 04:26:04.218505
Analysis finished: 2025-03-04 04:32:08.296218
Duration: 6 minutes and 4.08 seconds
Software version: ydata-profiling v4.12.2
Download configuration: config.json

Variables

tconst
Text

Unique 

Distinct: 11495243
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 729.1 MiB

Length

Max length: 10
Median length: 10
Mean length: 9.5099734
Min length: 9

Characters and Unicode

Total characters: 109319455
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
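
As an illustration of such character properties, the sketch below counts characters by Unicode general category using Python's standard unicodedata module. This is not how the report was produced (the report lists every category, script and block as "(unknown)"), and the standard module exposes the general category but not script or block. The helper name category_counts is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category (e.g. 'Ll', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "tt0000001"  # first tconst value shown in this report
cats = category_counts(sample)
for code, n in sorted(cats.items()):
    print(code, n)
```

For the tconst sample this separates the two lowercase letters (Ll) from the seven decimal digits (Nd).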

Unique

Unique: 11495243
Unique (%): 100.0%

Sample

1st row: tt0000001
2nd row: tt0000002
3rd row: tt0000003
4th row: tt0000004
5th row: tt0000005

Value | Count | Frequency (%)
tt0000019 | 1 | < 0.1%
tt9916880 | 1 | < 0.1%
tt0000001 | 1 | < 0.1%
tt0000002 | 1 | < 0.1%
tt0000003 | 1 | < 0.1%
tt0000004 | 1 | < 0.1%
tt0000005 | 1 | < 0.1%
tt0000006 | 1 | < 0.1%
tt0000007 | 1 | < 0.1%
tt0000008 | 1 | < 0.1%
Other values (11495233) | 11495233 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
t | 22990486 | 21.0%
1 | 11177578 | 10.2%
2 | 10594685 | 9.7%
0 | 9381525 | 8.6%
4 | 8731344 | 8.0%
3 | 8659163 | 7.9%
8 | 8401135 | 7.7%
6 | 8375193 | 7.7%
5 | 7260022 | 6.6%
7 | 6940354 | 6.3%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 109319455 | 100.0%

Most frequent character per category

(unknown): same rows as the "Most occurring characters" table.

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 109319455 | 100.0%

Most frequent character per script

(unknown): same rows as the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 109319455 | 100.0%

Most frequent character per block

(unknown): same rows as the "Most occurring characters" table.

titleType
Categorical

Imbalance 

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 715.3 MiB

Value | Count
tvEpisode | 8840212
short | 1047478
movie | 708001
video | 306953
tvSeries | 277952
Other values (6) | 314647

Length

Max length: 12
Median length: 9
Mean length: 8.2459039
Min length: 5

Characters and Unicode

Total characters: 94788669
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: short
2nd row: short
3rd row: short
4th row: short
5th row: short

Common Values

Value | Count | Frequency (%)
tvEpisode | 8840212 | 76.9%
short | 1047478 | 9.1%
movie | 708001 | 6.2%
video | 306953 | 2.7%
tvSeries | 277952 | 2.4%
tvMovie | 150118 | 1.3%
tvMiniSeries | 60182 | 0.5%
tvSpecial | 51593 | 0.4%
videoGame | 42180 | 0.4%
tvShort | 10573 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
tvepisode | 8840212 | 76.9%
short | 1047478 | 9.1%
movie | 708001 | 6.2%
video | 306953 | 2.7%
tvseries | 277952 | 2.4%
tvmovie | 150118 | 1.3%
tvminiseries | 60182 | 0.5%
tvspecial | 51593 | 0.4%
videogame | 42180 | 0.4%
tvshort | 10573 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
o | 11105516 | 11.7%
e | 10817505 | 11.4%
v | 10597883 | 11.2%
i | 10557556 | 11.1%
t | 10448683 | 11.0%
s | 10225824 | 10.8%
d | 9189345 | 9.7%
p | 8891805 | 9.4%
E | 8840212 | 9.3%
r | 1396185 | 1.5%
Other values (10) | 2718155 | 2.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 94788669 | 100.0%

Most frequent character per category

(unknown): same rows as the "Most occurring characters" table.

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 94788669 | 100.0%

Most frequent character per script

(unknown): same rows as the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 94788669 | 100.0%

Most frequent character per block

(unknown): same rows as the "Most occurring characters" table.

primaryTitle
Text

Distinct: 5168903
Distinct (%): 45.0%
Missing: 19
Missing (%): < 0.1%
Memory size: 869.2 MiB

Length

Max length: 458
Median length: 405
Mean length: 19.866783
Min length: 1

Characters and Unicode

Total characters: 228373121
Distinct characters: 203
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4712861
Unique (%): 41.0%

Sample

1st row: Carmencita
2nd row: Le clown et ses chiens
3rd row: Poor Pierrot
4th row: Un bon bock
5th row: Blacksmith Scene

Value | Count | Frequency (%)
episode | 4829052 | 12.7%
the | 1176448 | 3.1%
dated | 940667 | 2.5%
(blank) | 459592 | 1.2%
of | 404479 | 1.1%
a | 321779 | 0.8%
and | 254342 | 0.7%
in | 232152 | 0.6%
to | 190123 | 0.5%
2 | 150494 | 0.4%
Other values (1413762) | 29064003 | 76.4%

Most occurring characters

Value | Count | Frequency (%)
(blank) | 26527098 | 11.6%
e | 20058625 | 8.8%
i | 13152108 | 5.8%
o | 12968983 | 5.7%
a | 11415061 | 5.0%
s | 11180633 | 4.9%
d | 10119771 | 4.4%
r | 8221351 | 3.6%
t | 8134487 | 3.6%
n | 8107950 | 3.6%
Other values (193) | 98487054 | 43.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 228373121 | 100.0%

Most frequent character per category

(unknown): same rows as the "Most occurring characters" table.

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 228373121 | 100.0%

Most frequent character per script

(unknown): same rows as the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 228373121 | 100.0%

Most frequent character per block

(unknown): same rows as the "Most occurring characters" table.

originalTitle
Text

Distinct: 5193902
Distinct (%): 45.2%
Missing: 19
Missing (%): < 0.1%
Memory size: 870.6 MiB

Length

Max length: 458
Median length: 405
Mean length: 19.86421
Min length: 1

Characters and Unicode

Total characters: 228343539
Distinct characters: 190
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4738067
Unique (%): 41.2%

Sample

1st row: Carmencita
2nd row: Le clown et ses chiens
3rd row: Pauvre Pierrot
4th row: Un bon bock
5th row: Blacksmith Scene

Value | Count | Frequency (%)
episode | 4828988 | 12.7%
the | 1123849 | 3.0%
dated | 940666 | 2.5%
(blank) | 460601 | 1.2%
of | 383714 | 1.0%
a | 313452 | 0.8%
and | 247572 | 0.7%
in | 226118 | 0.6%
to | 187747 | 0.5%
de | 151384 | 0.4%
Other values (1450636) | 29135816 | 76.7%

Most occurring characters

Value | Count | Frequency (%)
(blank) | 26504698 | 11.6%
e | 20002760 | 8.8%
i | 13200140 | 5.8%
o | 12960285 | 5.7%
a | 11496781 | 5.0%
s | 11186560 | 4.9%
d | 10122621 | 4.4%
r | 8197867 | 3.6%
n | 8135441 | 3.6%
t | 8097489 | 3.5%
Other values (180) | 98438897 | 43.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 228343539 | 100.0%

Most frequent character per category

(unknown): same rows as the "Most occurring characters" table.

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 228343539 | 100.0%

Most frequent character per script

(unknown): same rows as the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 228343539 | 100.0%

Most frequent character per block

(unknown): same rows as the "Most occurring characters" table.

isAdult
Unsupported

Rejected  Unsupported 

Missing: 0
Missing (%): 0.0%
Memory size: 353.8 MiB
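
The alert for isAdult suggests checking whether the column needs cleaning. The report does not say why the type is unsupported; one possibility, stated here purely as an assumption, is that the column mixes 0/1 flags stored in different types with stray non-flag values. The sketch below shows one way such a column might be normalized. The helper normalize_is_adult and the spellings it accepts are hypothetical.

```python
from typing import Optional


def normalize_is_adult(value) -> Optional[bool]:
    """Map a raw isAdult cell to True/False, or None if it is not a clean 0/1 flag.

    Hypothetical helper: the accepted spellings are an assumption,
    not something the profiling report specifies.
    """
    if isinstance(value, bool):
        return value
    text = str(value).strip()
    if text in ("0", "0.0"):
        return False
    if text in ("1", "1.0"):
        return True
    return None  # covers the \N marker and any other unexpected value


# Made-up raw cells mixing integers, strings and junk values.
raw = [0, "0", 1, "1", "\\N", 2023, None]
cleaned = [normalize_is_adult(v) for v in raw]
print(cleaned)
```

Returning None for anything unexpected keeps bad cells visible as missing values instead of silently coercing them to a flag.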

startYear
Text

Distinct: 152
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 666.0 MiB

Length

Max length: 4
Median length: 4
Mean length: 3.7520616
Min length: 2

Characters and Unicode

Total characters: 43130860
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 1894
2nd row: 1892
3rd row: 1892
4th row: 1892
5th row: 1893

Value | Count | Frequency (%)
n | 1425056 | 12.4%
2021 | 507635 | 4.4%
2022 | 490335 | 4.3%
2018 | 457527 | 4.0%
2023 | 454328 | 4.0%
2019 | 453913 | 3.9%
2017 | 451143 | 3.9%
2020 | 435600 | 3.8%
2016 | 426978 | 3.7%
2015 | 401704 | 3.5%
Other values (142) | 5991024 | 52.1%

Most occurring characters

Value | Count | Frequency (%)
2 | 11303024 | 26.2%
0 | 10420534 | 24.2%
1 | 7297795 | 16.9%
9 | 3997466 | 9.3%
\ | 1425056 | 3.3%
N | 1425056 | 3.3%
8 | 1406881 | 3.3%
7 | 1300194 | 3.0%
3 | 1189419 | 2.8%
4 | 1173229 | 2.7%
Other values (2) | 2192206 | 5.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 43130860 | 100.0%

Most frequent character per category

(unknown): same rows as the "Most occurring characters" table.

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 43130860 | 100.0%

Most frequent character per script

(unknown): same rows as the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 43130860 | 100.0%

Most frequent character per block

(unknown): same rows as the "Most occurring characters" table.

endYear
Text

Distinct: 97
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 647.1 MiB

Length

Max length: 4
Median length: 2
Mean length: 2.0238279
Min length: 1

Characters and Unicode

Total characters: 23264393
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: \N
2nd row: \N
3rd row: \N
4th row: \N
5th row: \N

Value | Count | Frequency (%)
n | 11358288 | 98.8%
2019 | 7351 | 0.1%
2018 | 7249 | 0.1%
2017 | 7183 | 0.1%
2020 | 7158 | 0.1%
2021 | 7016 | 0.1%
2022 | 6574 | 0.1%
2023 | 6003 | 0.1%
2016 | 5740 | < 0.1%
2024 | 4965 | < 0.1%
Other values (87) | 77716 | 0.7%

Most occurring characters

Value | Count | Frequency (%)
\ | 11358288 | 48.8%
N | 11358288 | 48.8%
2 | 151696 | 0.7%
0 | 139684 | 0.6%
1 | 97361 | 0.4%
9 | 60014 | 0.3%
8 | 22136 | 0.1%
7 | 18922 | 0.1%
6 | 15484 | 0.1%
3 | 14663 | 0.1%
Other values (2) | 27857 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 23264393 | 100.0%

Most frequent character per category

(unknown): same rows as the "Most occurring characters" table.

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 23264393 | 100.0%

Most frequent character per script

(unknown): same rows as the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 23264393 | 100.0%

Most frequent character per block

(unknown): same rows as the "Most occurring characters" table.

runtimeMinutes
Text

Distinct: 958
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 646.6 MiB

Length

Max length: 23
Median length: 2
Mean length: 1.9859531
Min length: 1

Characters and Unicode

Total characters: 22829013
Distinct characters: 43
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 280
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 5
3rd row: 5
4th row: 12
5th row: 1

Value | Count | Frequency (%)
n | 7843462 | 68.2%
30 | 340014 | 3.0%
60 | 255433 | 2.2%
22 | 198479 | 1.7%
45 | 105179 | 0.9%
15 | 102319 | 0.9%
25 | 82346 | 0.7%
44 | 82268 | 0.7%
23 | 76341 | 0.7%
10 | 76338 | 0.7%
Other values (948) | 2333064 | 20.3%

Most occurring characters

Value | Count | Frequency (%)
N | 7843465 | 34.4%
\ | 7843462 | 34.4%
2 | 1210004 | 5.3%
0 | 1096238 | 4.8%
1 | 1008186 | 4.4%
3 | 777093 | 3.4%
4 | 767648 | 3.4%
5 | 724935 | 3.2%
6 | 524981 | 2.3%
8 | 371069 | 1.6%
Other values (33) | 661932 | 2.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 22829013 | 100.0%

Most frequent character per category

(unknown): same rows as the "Most occurring characters" table.

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 22829013 | 100.0%

Most frequent character per script

(unknown): same rows as the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 22829013 | 100.0%

Most frequent character per block

(unknown): same rows as the "Most occurring characters" table.

genres
Text

Distinct: 2385
Distinct (%): < 0.1%
Missing: 824
Missing (%): < 0.1%
Memory size: 744.8 MiB

Length

Max length: 32
Median length: 28
Mean length: 10.943463
Min length: 2

Characters and Unicode

Total characters: 125788746
Distinct characters: 37
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 212
Unique (%): < 0.1%

Sample

1st row: Documentary,Short
2nd row: Animation,Short
3rd row: Animation,Comedy,Romance
4th row: Animation,Short
5th row: Short

Value | Count | Frequency (%)
drama | 1298543 | 11.3%
comedy | 751676 | 6.5%
talk-show | 724994 | 6.3%
news | 598531 | 5.2%
documentary | 554411 | 4.8%
drama,romance | 524295 | 4.6%
n | 506003 | 4.4%
reality-tv | 366213 | 3.2%
adult | 314423 | 2.7%
news,talk-show | 260224 | 2.3%
Other values (2375) | 5595106 | 48.7%

Most occurring characters

Value | Count | Frequency (%)
a | 13323990 | 10.6%
m | 9981059 | 7.9%
o | 9574172 | 7.6%
r | 8416267 | 6.7%
e | 8411950 | 6.7%
, | 6838538 | 5.4%
y | 5832618 | 4.6%
t | 5792858 | 4.6%
i | 4831394 | 3.8%
n | 4506612 | 3.6%
Other values (27) | 48279288 | 38.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 125788746 | 100.0%

Most frequent character per category

(unknown): same rows as the "Most occurring characters" table.

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 125788746 | 100.0%

Most frequent character per script

(unknown): same rows as the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 125788746 | 100.0%

Most frequent character per block

(unknown): same rows as the "Most occurring characters" table.

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
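
Nullity correlation can be pictured as an ordinary Pearson correlation computed on "is missing" indicators instead of on the values themselves. The sketch below illustrates the idea on made-up columns; it is not the code behind the heatmap, and the helper names pearson and nullity_correlation are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")  # a column that is never (or always) null has no variance
    return cov / math.sqrt(vx * vy)


def nullity_correlation(col_a, col_b):
    """Correlate the 'is missing' indicators of two columns."""
    a = [1 if v is None else 0 for v in col_a]
    b = [1 if v is None else 0 for v in col_b]
    return pearson(a, b)


# Toy columns (made up for illustration, not taken from the dataset).
primary = ["A", None, "C", None, "E", "F"]
original = ["A", None, "C", None, "E", "F"]
genres = [None, "Drama", None, "Short", None, None]

same = nullity_correlation(primary, original)    # missing in the same rows
opposite = nullity_correlation(primary, genres)  # missing in complementary rows
print(same, opposite)
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is present when the other is missing, and columns with no missing values at all produce no defined correlation.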

Sample

index | tconst | titleType | primaryTitle | originalTitle | isAdult | startYear | endYear | runtimeMinutes | genres
0 | tt0000001 | short | Carmencita | Carmencita | 0 | 1894 | \N | 1 | Documentary,Short
1 | tt0000002 | short | Le clown et ses chiens | Le clown et ses chiens | 0 | 1892 | \N | 5 | Animation,Short
2 | tt0000003 | short | Poor Pierrot | Pauvre Pierrot | 0 | 1892 | \N | 5 | Animation,Comedy,Romance
3 | tt0000004 | short | Un bon bock | Un bon bock | 0 | 1892 | \N | 12 | Animation,Short
4 | tt0000005 | short | Blacksmith Scene | Blacksmith Scene | 0 | 1893 | \N | 1 | Short
5 | tt0000006 | short | Chinese Opium Den | Chinese Opium Den | 0 | 1894 | \N | 1 | Short
6 | tt0000007 | short | Corbett and Courtney Before the Kinetograph | Corbett and Courtney Before the Kinetograph | 0 | 1894 | \N | 1 | Short,Sport
7 | tt0000008 | short | Edison Kinetoscopic Record of a Sneeze | Edison Kinetoscopic Record of a Sneeze | 0 | 1894 | \N | 1 | Documentary,Short
8 | tt0000009 | movie | Miss Jerry | Miss Jerry | 0 | 1894 | \N | 45 | Romance
9 | tt0000010 | short | Leaving the Factory | La sortie de l'usine Lumière à Lyon | 0 | 1895 | \N | 1 | Documentary,Short

index | tconst | titleType | primaryTitle | originalTitle | isAdult | startYear | endYear | runtimeMinutes | genres
11495233 | tt9916838 | tvEpisode | Episode #3.13 | Episode #3.13 | 0 | 2009 | \N | \N | Drama
11495234 | tt9916840 | tvEpisode | Horrid Henry's Comic Caper | Horrid Henry's Comic Caper | 0 | 2014 | \N | 11 | Adventure,Animation,Comedy
11495235 | tt9916842 | tvEpisode | Episode #3.16 | Episode #3.16 | 0 | 2009 | \N | \N | Drama
11495236 | tt9916844 | tvEpisode | Episode #3.15 | Episode #3.15 | 0 | 2009 | \N | \N | Drama
11495237 | tt9916846 | tvEpisode | Episode #3.18 | Episode #3.18 | 0 | 2009 | \N | \N | Drama
11495238 | tt9916848 | tvEpisode | Episode #3.17 | Episode #3.17 | 0 | 2009 | \N | \N | Drama
11495239 | tt9916850 | tvEpisode | Episode #3.19 | Episode #3.19 | 0 | 2010 | \N | \N | Drama
11495240 | tt9916852 | tvEpisode | Episode #3.20 | Episode #3.20 | 0 | 2010 | \N | \N | Drama
11495241 | tt9916856 | short | The Wind | The Wind | 0 | 2015 | \N | 27 | Short
11495242 | tt9916880 | tvEpisode | Horrid Henry Knows It All | Horrid Henry Knows It All | 0 | 2014 | \N | 10 | Adventure,Animation,Comedy
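
The sample rows show \N in endYear and runtimeMinutes, and the per-variable statistics report 0 missing values for those columns even though \N accounts for 11358288 of the 11495243 endYear values. A minimal sketch, assuming the source is a tab-separated file in which \N marks a missing value (the report does not describe the input file), converts that marker to None while parsing. The two data rows are copied from the sample; the names NULL_MARKER and read_rows are hypothetical.

```python
import csv
import io

NULL_MARKER = "\\N"  # assumed missing-value marker (backslash followed by N)

# Two rows from the sample, written out as tab-separated text.
tsv = (
    "tconst\ttitleType\tprimaryTitle\toriginalTitle\tisAdult\t"
    "startYear\tendYear\truntimeMinutes\tgenres\n"
    "tt0000001\tshort\tCarmencita\tCarmencita\t0\t1894\t\\N\t1\tDocumentary,Short\n"
    "tt9916838\ttvEpisode\tEpisode #3.13\tEpisode #3.13\t0\t2009\t\\N\t\\N\tDrama\n"
)


def read_rows(handle):
    """Yield one dict per row, replacing the null marker with None."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield {key: (None if val == NULL_MARKER else val) for key, val in row.items()}


rows = list(read_rows(io.StringIO(tsv)))
# Count missing cells per column once the marker is treated as null.
missing_per_column = {col: sum(r[col] is None for r in rows) for col in rows[0]}
print(missing_per_column)
```

With the marker mapped to None, missing-value counts for columns such as endYear would reflect the \N cells instead of treating them as text.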